Clause-Based Reordering Constraints to Improve Statistical Machine Translation
نویسندگان
چکیده
We demonstrate that statistical machine translation (SMT) can be improved substantially by imposing clause-based reordering constraints during decoding. Our analysis of clause-wise translation of different types of clauses shows that it is beneficial to apply these constraints for finite clauses, but not for non-finite clauses. In our experiments in English-Hindi translation with an SMT system (DTM2), on a test corpus containing around 850 sentences with manually annotated clause boundaries, BLEU improves to 20.4 from the baseline score of 19.4. This statistically significant improvement is also confirmed by subjective (human) evaluation. We also report preliminary work on automatically identifying the kind of clause boundaries appropriate for enforcing reordering constraints.
منابع مشابه
A Unified Model for Soft Linguistic Reordering Constraints in Statistical Machine Translation
This paper explores a simple and effective unified framework for incorporating soft linguistic reordering constraints into a hierarchical phrase-based translation system: 1) a syntactic reordering model that explores reorderings for context free grammar rules; and 2) a semantic reordering model that focuses on the reordering of predicate-argument structures. We develop novel features based on b...
متن کاملNovel Reordering Approaches in Phrase-Based Statistical Machine Translation
This paper presents novel approaches to reordering in phrase-based statistical machine translation. We perform consistent reordering of source sentences in training and estimate a statistical translation model. Using this model, we follow a phrase-based monotonic machine translation approach, for which we develop an efficient and flexible reordering framework that allows to easily introduce dif...
متن کاملComparing Reordering Constraints for SMT Using Efficient BLEU Oracle Computation
This paper describes a new method to compare reordering constraints for Statistical Machine Translation. We investigate the best possible (oracle) BLEU score achievable under different reordering constraints. Using dynamic programming, we efficiently find a reordering that approximates the highest attainable BLEU score given a reference and a set of reordering constraints. We present an empiric...
متن کاملDivide and Translate: Improving Long Distance Reordering in Statistical Machine Translation
This paper proposes a novel method for long distance, clause-level reordering in statistical machine translation (SMT). The proposed method separately translates clauses in the source sentence and reconstructs the target sentence using the clause translations with non-terminals. The nonterminals are placeholders of embedded clauses, by which we reduce complicated clause-level reordering into si...
متن کاملRule-based Reordering Constraints for Phrase-based SMT
Translation results suffer when a standard phrase-based statistical machine translation system is used for translating long sentences. The translation output will not preserve the same word order as the source, especially between a language pair that has different syntactic structures. When a sentence is long, it should be partitioned into several clauses, and the word reordering during the tra...
متن کامل